Skip to content

fix: multi-project routing and recursive exclude patterns#75

Closed
PatrickSys wants to merge 10 commits intomasterfrom
fix/issue-63-multi-project-routing
Closed

fix: multi-project routing and recursive exclude patterns#75
PatrickSys wants to merge 10 commits intomasterfrom
fix/issue-63-multi-project-routing

Conversation

@PatrickSys
Copy link
Owner

Summary

  • Multi-project routing (Single MCP server instance should work across multiple projects #63): workspace-aware MCP request routing so each project root gets its own index, state, and tool dispatch. Adds project-state.ts, project-discovery.ts, and rewires src/index.ts to route by root.
  • Recursive exclude patterns: fixes index pollution where nested coverage/, dist/, worktrees/ directories leaked into search results because exclude globs lacked **/ prefix. Aligns indexer, file-watcher, and project-discovery ignore lists.

Key changes

  • src/index.ts — per-root state management, MCP roots wiring, tool dispatch routing
  • src/project-state.ts — new: factory + Map-backed per-project state
  • src/utils/project-discovery.ts — new: workspace/monorepo discovery with marker-based detection
  • src/core/indexer.ts — recursive **/ exclude patterns + new directories (worktrees, .claude, .nx, etc.)
  • src/core/file-watcher.ts — aligned ignore list with indexer
  • 4 new test files (multi-project routing, project discovery, project state, exclude patterns)

Test plan

  • npx vitest run — 312 tests pass, 0 failures
  • npx eslint on modified files — clean
  • Manual: run against a repo with git worktrees and verify no worktree hits in search_codebase

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d1197a9393

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +1307 to +1310
if (availableProjects.length === 1) {
const project = getOrCreateProject(availableProjects[0].rootPath);
await initProject(project.rootPath, watcherDebounceMs, { enableWatcher: true });
setActiveProject(project.rootPath);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Auto-route when a single root contains discovered subprojects

This only auto-selects when listProjectDescriptors() has exactly one entry. After a roots refresh that leaves one configured monorepo root, discovery also adds its nested projects to that list, so availableProjects.length becomes > 1 even though there is still only one root. In a rootless session where the workspace arrives later, the next tool call will therefore fall through to selection_required instead of routing to the sole root until the caller passes project manually.

Useful? React with 👍 / 👎.

Comment on lines +415 to +417
const nearestBoundary = await findNearestProjectBoundary(absolutePath, containingRoot);
const resolvedProjectPath =
nearestBoundary?.rootPath ?? containingRoot ?? (stats.isDirectory() ? absolutePath : undefined);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Accept file selectors that have no detectable project markers

The new project selector advertises absolute file paths, but this fallback only succeeds when findNearestProjectBoundary() found a marker or the selector itself is a directory. If the caller passes a file from a standalone/exported tree with no .git, .codebase-context, or supported manifest above it, both nearestBoundary and containingRoot are undefined and we return unknown_project. That makes file-path routing unusable in rootless/no-roots sessions for exactly the ad-hoc codebases this change is meant to support.

Useful? React with 👍 / 👎.

Comment on lines +118 to +120
if (directoryNames.has('.git')) {
return {
candidate: { rootPath: directoryPath, evidence: 'repo_root' },

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Detect .git files as repo roots during project discovery

Nested Git submodules and linked worktrees usually expose .git as a file, not a directory. classifyDirectory() only checks directoryNames.has('.git'), so discoverProjectsWithinRoot() and findNearestProjectBoundary() miss those projects entirely. In a workspace that contains a submodule without another manifest marker, selecting a file inside it will climb to the outer root and route tools/resources to the wrong project.

Useful? React with 👍 / 👎.

@greptile-apps
Copy link

greptile-apps bot commented Mar 18, 2026

Greptile Summary

This PR delivers two things: (1) workspace-aware multi-project routing in the MCP server so a single server instance can serve multiple repos/monorepos without one config entry per project, and (2) a fix for index pollution caused by flat (non-recursive) exclude globs that allowed nested coverage/, dist/, and worktrees/ directories to leak into search results.

Key changes:

  • src/index.ts — major rewrite: session-level active-project tracking, LRU watcher eviction (capped at 5 watchers), three-tier project selector (project > active session project > single known project), selection_required structured errors with availableProjects, project-scoped codebase://context/project/<encoded-path> resource URIs.
  • src/project-state.ts — clean factory/registry module (pre-existing or added in an earlier commit; not in the 20-file changed list despite the PR description listing it as new).
  • src/utils/project-discovery.ts — new marker-based workspace/monorepo traversal (discoverProjectsWithinRoot, findNearestProjectBoundary, isPathWithin).
  • src/core/indexer.ts / src/core/file-watcher.ts — exclude patterns upgraded from coverage/** to **/coverage/** plus new directories (worktrees, .claude, .nx, .turbo, .next, .cache, .planning, target, vendor).
  • 4 new test files with good coverage of routing edge cases, exclude patterns, and discovery logic.

Issues found:

  • P1 – unguarded JSON.parse in resolveProjectSelector (src/index.ts): error-code routing parses ToolResponse.content[0].text without a try/catch; an empty string (not null/undefined) bypasses the ?? '{}' fallback and throws SyntaxError.
  • P1 – empty workspaces.packages array treated as a workspace root (src/utils/project-discovery.ts): isWorkspacePackageJson checks Array.isArray(packages) but not packages.length > 0, inconsistently classifying { workspaces: { packages: [] } } as a workspace root while correctly rejecting { workspaces: [] }. This can cause unnecessary deep directory traversal.
  • P2 – buildProjectDescriptor has a hidden side-effect (src/index.ts): calls rememberProjectPath() on every invocation, including error-reporting paths.
  • P2 – directory walker follows symlinks (src/utils/project-discovery.ts): symlinked directories are traversed without verifying they stay within the trusted root.
  • P2 – uniform description prefix on all tools (src/tools/index.ts): the routing preamble is prepended to every tool's description, which may degrade tool-selection accuracy for AI MCP clients.

Confidence Score: 3/5

  • Two P1 correctness bugs should be fixed before merging; the rest is well-structured and well-tested.
  • The recursive-exclude fix and the overall routing architecture are solid and backed by a comprehensive new test suite (312 passing tests). However, the unguarded JSON.parse in the error-code routing path can throw an uncaught SyntaxError in a real MCP request handler, and the inconsistent empty-array guard in isWorkspacePackageJson can produce false workspace classifications that affect directory traversal. Both are edge cases but are reachable in production. The remaining P2 items are design concerns that don't affect correctness in the common path.
  • src/index.ts (resolveProjectSelector JSON parse + buildProjectDescriptor side-effect) and src/utils/project-discovery.ts (isWorkspacePackageJson empty-packages check + symlink traversal)

Important Files Changed

Filename Overview
src/index.ts Core MCP server heavily reworked: adds per-session active-project tracking, LRU watcher eviction (MAX_WATCHED_PROJECTS=5), relative-path and file-path project selectors, project-scoped resource URIs, and a workspace overview fallback. Two issues found: unguarded JSON.parse for error-code routing (can throw on empty string) and side effects inside buildProjectDescriptor.
src/utils/project-discovery.ts New module for workspace/monorepo traversal using marker-based detection. Logic is solid but two issues: isWorkspacePackageJson inconsistently treats packages:[] as a workspace root (unlike the workspaces:[] case), and the walker follows symlinked directories without escaping the trusted root boundary.
src/core/indexer.ts Exclude patterns upgraded from flat globs (coverage/, dist/) to recursive patterns (/coverage/, etc.) and extended with new directories (worktrees, .claude, .nx, .turbo, .next, .cache, .planning, target, vendor). Straightforward and well-tested.
src/core/file-watcher.ts Ignore list aligned with the indexer: adds build, .nx, target, vendor, worktrees, .claude. No logic changes, purely additive.
src/tools/index.ts withProjectDirectory replaced by withProjectSelector which injects both project (new primary selector) and project_directory (deprecated alias) into all tool schemas. Also unconditionally prepends a routing preamble to every tool description, which may affect tool-selection accuracy for AI clients.
src/resources/uri.ts Adds buildProjectContextResourceUri and getProjectPathFromContextResourceUri for project-scoped codebase://context/project/ URIs. Round-trip encoding is tested and correct.
src/project-state.ts Clean factory module (mentioned in PR description as new, though absent from the changed-files list). Exposes getOrCreateProject, getProject, getAllProjects, removeProject, makePaths, normalizeRootKey backed by a module-level Map. No issues found.
tests/multi-project-routing.test.ts Comprehensive test rewrite covering rootless startup, invalid root filtering, lazy indexing, active-project stickiness, monorepo subdirectory selection, file-path resolution, selection_required flows, and project-scoped resource reads. Good coverage.
tests/indexer-exclude-patterns.test.ts New integration test that creates a real temp directory tree with nested coverage/, worktrees/, .claude/, and dist/ directories and verifies only legitimate source files appear in the keyword index. Solid regression guard for the recursive-exclude fix.
tests/project-discovery.test.ts New unit tests for discoverProjectsWithinRoot and findNearestProjectBoundary. Covers marker detection, ignored directories, nearest-boundary resolution, and trust-root escape prevention. Well-structured.

Sequence Diagram

sequenceDiagram
    participant Client
    participant Server as MCP Server (index.ts)
    participant Resolver as resolveProjectForTool
    participant Discovery as project-discovery.ts
    participant State as project-state.ts

    Client->>Server: tools/call (search_codebase, args)
    Server->>Resolver: resolveProjectForTool(args)

    alt args.project or args.project_directory provided
        Resolver->>Resolver: resolveExplicitProjectSelection()
        Resolver->>Resolver: resolveProjectSelector(selector)
        alt absolute path or file:// URI
            Resolver->>Discovery: findNearestProjectBoundary(path)
            Discovery-->>Resolver: DiscoveredProjectCandidate
            Resolver->>State: getOrCreateProject(rootPath)
        else relative subproject path
            Resolver->>Resolver: match against listProjectDescriptors()
            Resolver->>Resolver: join with each knownRoot
            Resolver->>Discovery: resolveProjectFromAbsolutePath()
            Discovery-->>Resolver: ProjectState
        end
        Resolver->>Server: setActiveProject(rootPath)
    else no project selector
        alt activeProjectKey is set
            Resolver->>State: getOrCreateProject(activeProject)
        else one project available
            Resolver->>State: getOrCreateProject(singleProject)
            Resolver->>Server: setActiveProject()
        else multiple projects, no active
            Resolver-->>Server: selection_required error
            Server-->>Client: {status:"selection_required", availableProjects:[...]}
        end
    end

    Server->>Server: initProject (ensureIndexed + watcher)
    Server->>Server: dispatchTool(name, args, createToolContext(project))
    Server->>Server: inject {project, index} into JSON response
    Server-->>Client: {status:"success", results:[...], project:{...}}
Loading

Comments Outside Diff (4)

  1. src/utils/project-discovery.ts, line 2063-2076 (link)

    P1 Inconsistent empty-array check for workspaces.packages

    The first branch correctly guards against an empty workspaces array (parsed.workspaces.length > 0), so a package.json with "workspaces": [] is not treated as a workspace root.

    The second branch does not apply the same guard. Array.isArray([]) is true, so a package.json with "workspaces": { "packages": [] } is incorrectly classified as a workspace root, which forces continueScanning: true and causes deeper (potentially expensive) directory traversal where none is needed.

  2. src/index.ts, line 906-919 (link)

    P2 Unexpected side-effect inside buildProjectDescriptor

    buildProjectDescriptor calls rememberProjectPath() — which updates projectSourcesByKey — on every invocation. Functions named build… conventionally return a value without mutating shared state. Here the side effect means that every place that calls buildProjectDescriptor for display or error-reporting purposes (e.g. inside buildProjectSelectionPayload and buildProjectSelectionError) silently registers the path as a known project, which could subtly inflate listProjectDescriptors() and affect subsequent routing decisions.

    Consider moving the rememberProjectPath call to the explicit registration sites (registerKnownRoot, registerDiscoveredProjectPath, resolveProjectFromAbsolutePath, etc.) and making buildProjectDescriptor a pure getter.

  3. src/utils/project-discovery.ts, line 2141-2176 (link)

    P2 Symlinked directories are followed without a depth-guard bypass

    fs.readdir with { withFileTypes: true } returns Dirent entries whose isDirectory() method follows symlinks. A symlink that points to an ancestor of the current path creates a cycle. The depth >= maxDepth cap limits the damage (max 4 levels), but the walker will still descend into every symlinked directory up to that cap, potentially reading large subtrees or escaping the intended root boundary.

    Consider filtering out symlinked directories before traversal:

    .filter((entry) => !entry.isSymbolicLink())

    or using fs.realpath to verify the resolved path stays within resolvedTrustedRootPath before recursing.

  4. src/tools/index.ts, line 1920-1929 (link)

    P2 All tool descriptions are unconditionally prefixed

    withProjectSelector prepends "Routes to the active/current project automatically when known. " to every tool's description regardless of whether the tool actually does project routing (e.g. get_indexing_status or remember). MCP clients and AI agents read these descriptions for tool selection; adding a uniform preamble to all 10 tools makes them harder to differentiate and may degrade tool-selection accuracy.

    Consider only prepending for tools that perform index-consuming operations (INDEX_CONSUMING_TOOL_NAMES), or making the description update opt-in per tool definition.

Last reviewed commit: "fix: make indexer ex..."

Comment on lines +1323 to +1328
)
};
}

async function resolveProjectForResource(): Promise<ProjectState | undefined> {
const availableRoots = getKnownRootPaths();
if (availableRoots.length !== 1) {
const activeProject = activeProjectKey ? getTrackedRootPathByKey(activeProjectKey) : undefined;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Unguarded JSON.parse when routing by error code

resolveProjectSelector reaches into a ToolResponse by parsing its JSON text in order to inspect errorCode. If resolution.response.content?.[0]?.text is an empty string (not null/undefined), the ?? '{}' nullish fallback won't fire — the nullish coalescing operator only substitutes on null/undefined — and JSON.parse('') will throw a SyntaxError that is not caught anywhere in the call chain, crashing the MCP request handler.

In addition, this approach couples routing logic to the serialised wire format: any future change to how buildProjectSelectionError serialises its payload would silently break project resolution here.

Suggested change
)
};
}
async function resolveProjectForResource(): Promise<ProjectState | undefined> {
const availableRoots = getKnownRootPaths();
if (availableRoots.length !== 1) {
const activeProject = activeProjectKey ? getTrackedRootPathByKey(activeProjectKey) : undefined;
let errorCode: string | undefined;
try {
errorCode = (
JSON.parse(resolution.response.content?.[0]?.text || '{}') as {
errorCode?: string;
}
).errorCode;
} catch {
errorCode = undefined;
}
if (errorCode !== 'unknown_project') {
return resolution;
}

@PatrickSys
Copy link
Owner Author

Superseded by #67. The recursive exclude patterns fix (d1197a9) is the only delta — will land separately if needed.

@PatrickSys PatrickSys closed this Mar 18, 2026
@PatrickSys PatrickSys deleted the fix/issue-63-multi-project-routing branch March 18, 2026 21:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant